On the use of a fuzzy classifier to speed up the Sp_ToBI labeling of the Glissando Spanish corpus

نویسندگان

  • David Escudero Mancebo
  • Lourdes Aguilar
  • César González Ferreras
  • Yurena Gutiérrez-González
  • Valentín Cardeñoso-Payo
چکیده

In this paper, we present the application of a novel automatic prosodic labeling methodology for speeding up the manual labeling of the Glissando corpus (Spanish read news items). The methodology is based on the use of soft classification techniques. The output of the automatic system consists on a set of label candidates per word. The number of predicted candidates depends on the degree of certainty assigned by the classifier to each of the predictions. The manual transcriber checks the sets of predictions to select the correct one. We describe the fundamentals of the fuzzy classification tool and its training with a corpus labeled with Sp TOBI labels. Results show a clear coherence between the most confused labels in the output of the automatic classifier and the most confused labels detected in inter-transcriber consistency tests. More importantly, in a preliminary test, the real time ratio of the labeling process was 1:66 when the template of predictions is used and 1:80 when it is not.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Cultural Influence on the Expression of Cathartic Conceptualization in English and Spanish: A Corpus-Based Analysis

This paper investigates the conceptualization of emotional release from a cognitive linguistics perspective (Cognitive Metaphor Theory). The metaphor weeping is a means of liberating contained emotions is grounded in universal embodied cognition and is reflected in linguistic expressions in English and Spanish. Lexicalization patterns which encapsulate this conceptualization i...

متن کامل

Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents

Besides for its own merits, text classification (TC) has become a cornerstone in many applications. Work presented here is part of and a pre-requisite for a project we have overtaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...

متن کامل

Thyroid disorder diagnosis based on Mamdani fuzzy inference system classifier

Introduction: Classification and prediction are two most important applications of statistical methods in the field of medicine. According to this note that the classical classification are provided due to the clinical symptom and  do not involve the use of specialized information and knowledge. Therefore, using a classifier that can combine all this information, is necessary. The aim of this s...

متن کامل

A research on classification performance of fuzzy classifiers based on fuzzy set theory

Due to the complexities of objects and the vagueness of the human mind, it has attracted considerable attention from researchers studying fuzzy classification algorithms. In this paper, we propose a concept of fuzzy relative entropy to measure the divergence between two fuzzy sets. Applying fuzzy relative entropy, we prove the conclusion that patterns with high fuzziness are close to the classi...

متن کامل

On the use of Heronian means in a similarity classifier

This paper introduces new similarity classifiers using the Heronian mean, and the generalized Heronian mean operators. We examine the use of these operators at the aggregation step within the similarity classifier. The similarity classifier was earlier studied with other operators, in particular with an arithmetic mean, generalized mean, OWA operators, and many more. The two classifiers here ar...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2014